Recombinant production of proteins in yeast

ABSTRACT

A process for the recombinant production of proteins in the yeast Hansenula comprises transforming Hansenula with an expression cassette which comprises the following structural elements encoded:

    L-A-P-GEN

where

L is a leader sequence,

A is an adaptor producing an alpha-helix structure,

P is a processing signal and

GEN is a structural gene for the required protein.

The present invention relates to a process for the recombinant production of proteins in the yeast Hansenula.

The recombinant production of proteins in the yeast Hansenula is known. European Patent 173378 describes the recombinant preparation of proteins using particular promoter elements of MOX or DAS. However, this document provides no information as to how efficient secretion and correct processing of the required protein is to be achieved.

Furthermore, it is known that in Hansenula polymorpha, the glucoamylase leader sequence (GAM1) from Schwanniomyces occidentalis is recognized as signal sequence, and it is possible to secrete correctly processed glucoamylase (G. Gellissen et al., Bio/Technology 9 (1991) 291-295). However, this signal sequence does not lead to the secretion of gene products foreign to yeasts; for example, it is not possible to secrete the protein hirudin therewith.

It is an object of the present invention to provide a process for the recombinant production of proteins, in particular of proteins which are foreign to yeasts, i.e. heterologous, in the yeast Hansenula, which ensures efficient secretion and correct processing for a large number of proteins.

We have found that this object is achieved by a process for the recombinant production of proteins in the yeast Hansenula, which comprises transforming Hansenula with an expression cassette which comprises the following structural elements encoded:

    L-A-P-GEN

where

L is a leader sequence,

A is an adaptor producing an alpha-helix structure,

P is a processing signal and

GEN is a structural gene for the required protein.

It is possible to use as leader sequence L the leader sequences of all gene products secreted in yeast which are recognized by Hansenula. It is not a necessary requirement that the leader sequence originates from a Hansenula gene. Leader sequences of yeasts of genera other than Hansenula are also suitable, for example Saccharomyces or Schwanniomyces. A leader sequence which is very suitable for the invention is, for example, the alpha factor leader sequence from Saccharomyces cerevisiae (MATα).

Leader sequences which are preferably used are those of strongly expressed and secreted hydrolytic enzymes such as alpha-amylase, invertase, acid phosphatase or glucoamylase. The glucoamylase leader sequence from Schwanniomyces occidentalis is particularly preferably used.

Suitable sequences as adaptor A are all those which code for a polypeptide which contains an alpha-helix structure. The presence of an alpha-helix structure can be determined by the algorithm of Garnier et al. (J. Mol. Biol. 120 (1978) 97-120). It is particularly easy to determine, using commercially obtainable computer programs based on this algorithm, whether a polypeptide sequence ought to have an alpha-helix structure.

As a rule, sequences which are very suitable as adaptor are all those for which the computer program Microgenie® (Beckmann) calculates for ALPHA a larger positive value than for the three other possible structures (BETA, TURN, COIL) for a peptide sequence of at least four amino acids in the region of the processing site A-P-GEN.
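The selection rule just stated can be sketched in Python. This is only an illustration of the comparison step, not of the Garnier algorithm itself: the per-residue scores are treated as given inputs (the text obtains them from a structure-prediction program), and the function name `has_alpha_run` and all example numbers are hypothetical.

```python
from typing import Mapping, Sequence


def has_alpha_run(scores: Sequence[Mapping[str, float]], min_len: int = 4) -> bool:
    """Return True if at least `min_len` consecutive residues have a positive
    ALPHA score that is larger than their BETA, TURN and COIL scores."""
    run = 0
    for residue in scores:
        best_other = max(residue["BETA"], residue["TURN"], residue["COIL"])
        if residue["ALPHA"] > 0 and residue["ALPHA"] > best_other:
            run += 1
            if run >= min_len:
                return True
        else:
            run = 0  # the stretch is interrupted, start counting again
    return False


# Hypothetical scores for two four-residue windows near a processing site.
helical = [{"ALPHA": 120.0, "BETA": 10.0, "TURN": -30.0, "COIL": -50.0}] * 4
mixed = helical[:3] + [{"ALPHA": -5.0, "BETA": 40.0, "TURN": 0.0, "COIL": 10.0}]

print(has_alpha_run(helical))
print(has_alpha_run(mixed))
```

The window length of four residues follows the text; which residues count as "the region of the processing site" is left to the caller.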

The length of the adaptor sequence A can vary within wide limits for the use according to the invention. As a rule, it is from five to one hundred amino acids.

A sequence of the glucoamylase from Schwanniomyces occidentalis which contains amino acids 23-72 (GAM 23-72; Dohmen et al. Gene 95 (1990), 111-121) is preferably used as adaptor sequence.

This sequence can be used as adaptor sequence directly or, particularly preferably, after extension at the C terminus by one to four amino acids. Parts of this sequence, preferably those obtained by N-terminal truncation, are also very suitable for the process according to the invention.

It is also possible, for example by means of the computer program described above, for the sequence regions which particularly contribute to the alpha-helix formation to be identified and also optimized in respect of the alpha-helix structure by exchange of individual amino acids.

A sequence which has proven particularly suitable as adaptor for the preparation of thrombin inhibitors, especially hirudin and hirudin derivatives, by the process according to the invention is the following:

    GAM 23-72-His-Pro-Leu-Glu                                  (SEQ ID NO: 1)

If this sequence (=A) is combined with the leader sequence of glucoamylase (GAM 1-22) (=L) the result is a leader-adaptor sequence with the structure GAM 1-72-His-Pro-Leu-Glu comprising 76 amino acids, which is particularly advantageous for the process according to the invention.
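The residue counts quoted here can be checked against the sequence listing with a short Python sketch. The three-letter string is SEQ ID NO: 1 as given in the listing; the helper name `split_three_letter` is hypothetical, and the leader is represented only by its length (22 residues, GAM 1-22) because its residues are not listed in this document.

```python
from typing import List

# SEQ ID NO: 1 (adaptor), copied from the sequence listing in three-letter code.
SEQ_ID_NO_1 = (
    "AlaProAlaSerSerIleGlySerSerAlaSerAlaSerSerSerSer"
    "GluSerSerGlnAlaThrIleProAsnAspValThrLeuGlyValLys"
    "GlnIleProAsnIlePheAsnAspSerAlaValAspAlaAsnAlaAla"
    "AlaLysHisProLeuGlu"
)

LEADER_LENGTH = 22  # GAM 1-22; only the length is known from the text


def split_three_letter(sequence: str) -> List[str]:
    """Split a run of three-letter residue codes into a list of residues."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


adaptor = split_three_letter(SEQ_ID_NO_1)
gam_part = adaptor[:-4]   # GAM 23-72
extension = adaptor[-4:]  # C-terminal extension

print(len(gam_part))                 # residues contributed by GAM 23-72
print("-".join(extension))           # the four added residues
print(len(adaptor))                  # length of SEQ ID NO: 1
print(LEADER_LENGTH + len(adaptor))  # length of the leader-adaptor sequence
```

The counts agree with the text: 50 residues from GAM 23-72 plus the four-residue extension give the 54 residues of SEQ ID NO: 1, and adding the 22-residue leader gives 76.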

The processing signal P serves to cleave the propeptide to the mature form. Normally, a sequence of basic amino acids is used as processing signal. A very suitable processing signal is the KEX2 recognition site from S. cerevisiae, which consists of the dipeptide Lys-Arg and is also recognized by the yeast Hansenula. This dipeptide can also be used in duplicated form as processing signal. The sequence Lys-Arg is preferred as P.
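As a minimal sketch of what the processing signal does, the following Python snippet models cleavage on the C-terminal side of Lys-Arg as a plain string operation. It is not a model of the enzyme. The adaptor is SEQ ID NO: 1 written in one-letter code; `HIRUDIN_N_TERM` holds only the first seven residues encoded by primer SEQ ID NO: 4 and stands in for the full structural gene; the function name `release_mature` is hypothetical; and the sketch assumes the signal does not occur anywhere upstream of the intended site.

```python
# SEQ ID NO: 1 (adaptor A) in one-letter code.
ADAPTOR = "APASSIGSSASASSSSESSQATIPNDVTLGVKQIPNIFNDSAVDANAAAKHPLE"

# First residues of the hirudin coding region as encoded by primer SEQ ID NO: 4;
# the remainder of the protein is deliberately omitted.
HIRUDIN_N_TERM = "VVYTDCT"


def release_mature(precursor: str, signal: str = "KR") -> str:
    """Return the part of `precursor` that follows the first occurrence of the
    processing signal (Lys-Arg by default, 'KRKR' for the duplicated form)."""
    site = precursor.find(signal)
    if site < 0:
        raise ValueError("processing signal not found in precursor")
    return precursor[site + len(signal):]


single = ADAPTOR + "KR" + HIRUDIN_N_TERM
doubled = ADAPTOR + "KRKR" + HIRUDIN_N_TERM

print(release_mature(single))
print(release_mature(doubled, signal="KRKR"))
```

Both calls return the same N-terminus, which is the sense in which the protein is "correctly processed": no adaptor or signal residues remain on the mature product.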

Heterologous and homologous genes can be used as structural gene GEN for the protein to be produced. The genes can be isolated from the appropriate organisms or prepared by synthesis. In the case of chemical gene synthesis it is also possible, if required, to adapt the codon usage to the producer organism.

Eukaryotic genes are preferably employed as structural genes. The process according to the invention succeeds particularly well for the production of thrombin inhibitors, for example hirudin. This process is also very suitable for the production of human polypeptides, for example peptide hormones, growth factors and lymphokines.

The abovementioned structural elements are arranged in a known manner in the sequence L-A-P-GEN in an expression cassette. The linkage normally takes place by ligation of compatible restriction fragments or by chemical synthesis.

The expression cassettes may furthermore contain a number of conventional regulation signals such as promoters, ribosome binding sites and terminators, which are functionally connected to the structural elements L-A-P-GEN according to the invention.

The expression cassette can be part of an autonomously replicating or else an integrative vector. The construction of an expression vector using the expression cassette is described in Example 1.

The yeast Hansenula is transformed with the appropriate expression vector which contains the expression cassette. This can take place, for example, by the protocol described in Example 2.

Stably expressing clones which are suitable as producer organism in the process according to the invention are isolated from the yeast transformed in this way. The producer organisms are cultivated under conventional conditions and produce the required protein in a constitutive or inducible manner depending on the regulation elements selected. The protein is secreted by the producer organism into the surrounding medium, from where it can easily be isolated and purified.

Purification from the medium takes place as a rule, after the producer organism has been removed, by purification processes familiar in protein chemistry.

The process according to the invention provides correctly processed mature proteins without the faulty processing otherwise observed. This process therefore leads to a high yield of mature protein and considerably facilitates the subsequent purification steps. This process can therefore be employed particularly well for the production of drugs based on pharmaceutical proteins.

The following examples explain the invention further.

EXAMPLE 1

Construction of vectors for the secretory expression of recombinant proteins from the yeast strain Hansenula polymorpha

This example describes the construction of expression vectors which are used in the production according to the invention of recombinant proteins in Hansenula polymorpha. The expression cassette used for this purpose comprises inter alia the following constituents:

Leader: Amino acid 1-22 of the glucoamylase sequence from Schwanniomyces occidentalis (Dohmen et al. Gene 95 (1990), 111-121)

Adaptor: SEQ ID NO: 1

Processing signal: Lys-Arg

GEN: Thrombin inhibitor gene

Starting from the abovementioned glucoamylase sequence from Schwanniomyces occidentalis, the GAM sequence from base pair 1 to 207 (corresponding to amino acid 1 (Met) to amino acid 69 (Ala); see FIG. 1) was prepared with the aid of synthetic oligonucleotides and PCR amplification.

Two oligonucleotides with the sequences SEQ ID NO: 2 and SEQ ID NO: 3 were prepared and used as amplification primers for the PCR.

The resulting GAM leader-adaptor part-fragment was then cut with EcoRI at the 5' end and with PvuII at the 3' end.

For the secretory preparation of hirudin, an adaptor-processing signal-hirudin gene (A-P-GEN) was prepared starting from the known hirudin gene with the aid of two synthetic oligonucleotides and PCR amplification. The oligonucleotides used for this had the sequences SEQ ID NO: 4 and SEQ ID NO: 5.

The amplified DNA fragment was then cut at the 5' end with PvuII and at the 3' end with SalI.

Ligation of the 3'-end PvuII site of the GAM-leader-adaptor part-fragment to the 5'-end PvuII site of the A-P-GEN, and ligation of this fragment via EcoRI/SalI into pUC18, completed the construct.
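That the PvuII ligation joins the two PCR fragments in frame can be checked from the primer sequences alone. The Python sketch below takes SEQ ID NO: 3 and SEQ ID NO: 4 from the sequence listing, cuts each at the PvuII site (CAG^CTG, blunt), joins the ends and translates the junction with the standard genetic code. The helper names are hypothetical, and the reading frame is assumed to start at the first base of the reverse complement of SEQ ID NO: 3; the agreement of the translation with the end of SEQ ID NO: 1 supports that assumption.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SEQ_ID_NO_3 = "GGGGGGCAGCTGCATTAGCATCGACAGCAGA"  # 3' primer, GAM fragment
SEQ_ID_NO_4 = "GGGGGGCAGCTGCTAAACACCCTCTGGAAAAAAGAGTTGTTTACACTGACTGCACT"  # 5' primer, A-P-GEN
PVUII = "CAGCTG"  # recognition site, cut between CAG and CTG


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Coding strand at the 3' end of the GAM leader-adaptor fragment.
coding_end = reverse_complement(SEQ_ID_NO_3)
left = coding_end[: coding_end.index(PVUII) + 3]     # keeps ...CAG
right = SEQ_ID_NO_4[SEQ_ID_NO_4.index(PVUII) + 3:]   # starts at CTG...
junction = left + right

print(translate(junction))
```

Read in one-letter code, the result runs from the end of the GAM part (Ser-Ala-Val-Asp-Ala-Asn-Ala-Ala-Ala-Lys) through His-Pro-Leu-Glu and the Lys-Arg processing signal into the first residues of the hirudin coding region.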

The L-A-P-GEN fragment was in turn isolated from this construct as EcoRI/BglII fragment and ligated into the appropriately prepared H. polymorpha expression vector pFMD 13025 (Gellissen G. et al., TIBTECH, 10 (1992) 413-417). This entails fusion of the 5' end of L-A-P-GEN to the H. polymorpha promoter and of the 3' end of the fragment to the H. polymorpha terminator. The expression cassette is now complete and a constituent of a shuttle vector with which both E. coli, for the purpose of propagation, and the yeast H. polymorpha, for the purpose of expression of the foreign gene, can be transformed.

The same L-A-P construction was fused to the gene for the thrombin inhibitor rhodniin from Rhodnius prolixus (WO 93/8282) and to the gene for the thrombin inhibitor moubatin from Ornithodorus moubata (WO 93/9232). The expression cassettes obtained in this way were employed in a similar way to the hirudin gene fusions for constructing Hansenula polymorpha expression vectors.

EXAMPLE 2

Transformation of Hansenula polymorpha with the expression vectors

The host strain for the transformation is an auxotrophic mutant obtained by EMS mutagenesis: a strain with a deficiency for orotidine-5'-phosphate decarboxylase (ura⁻). The reversion rate of this uracil mutant can be neglected.

Competent cells of this strain were obtained in the following way (method of Dohmen et al., Yeast 7 (1991) 691-692):

10 ml of yeast complete medium (YPD) were inoculated with the host strain and cultivated by shaking at 37° C. overnight. This preculture was transferred into 200 ml of YPD medium and cultivated by shaking at 37° C. until an OD₆₀₀ of 0.6-1.0 was reached. The cells were washed with 0.5 volume of solution A (1M sorbitol, 10 mM bicine pH 8.35, 3% ethylene glycol) at room temperature and subsequently resuspended in 0.02 volume of solution A.
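The amounts of solution A follow from the culture volume by simple arithmetic. The Python sketch below reads the wash and resuspension amounts as fractions (0.5 and 0.02) of the main culture volume, which is an interpretation of the text; the function name `solution_a_volumes` is hypothetical.

```python
from typing import Dict


def solution_a_volumes(culture_ml: float,
                       wash_fraction: float = 0.5,
                       resuspend_fraction: float = 0.02) -> Dict[str, float]:
    """Volumes of solution A (in ml) for washing and resuspending the cells,
    expressed as fractions of the main culture volume."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    return {
        "wash_ml": culture_ml * wash_fraction,
        "resuspend_ml": culture_ml * resuspend_fraction,
    }


volumes = solution_a_volumes(200.0)  # 200 ml main culture as in this example
print(volumes)
```

For the 200 ml culture described here this gives a 100 ml wash and a 4 ml final suspension, i.e. a fifty-fold concentration of the cells before freezing.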

After adding 11 μl of DMSO, the aliquots were stored at -70° C. until the transformation was carried out.

For the transformation, 10 μg of plasmid DNA and 100 μl of cold 0.1M calcium chloride solution were added directly to the frozen competent cells.

After rapid thawing at 37° C., each transformation mixture was incubated with 1.0 ml of solution B (40% polyethylene glycol PEG 3350, 200 mM bicine pH 8.35) at 37° C. for one hour. The cells were then washed in 1 ml of solution C (150 mM NaCl, 10 mM bicine pH 8.35), resuspended in 200 μl of solution C and plated onto selective medium (YNB glucose, complementation of the uracil deficiency by ura⁺ expression plasmids). Incubation took place at 37° C. for 3-5 days.

EXAMPLE 3

Isolation of mitotically stable clones

The recombinant expression plasmids used for transforming H. polymorpha are autonomously replicating and can integrate spontaneously into the yeast genome. They form a multimeric structure therein: the plasmid monomers are connected together head to tail. Several copies of the expression cassette therefore contribute to production of the recombinant gene product. The productivity of a recombinant strain is linearly related to the number of integrated expression cassettes over a wide range. Multimeric integration of the foreign DNA into the yeast genome and isolation of mitotically stable clones was achieved in the following way:

The transformants were inoculated from the agar plates with selective medium into 3 ml of appropriate liquid medium and passaged, i.e. repeated transfer into fresh YNB glucose medium (50 μl in 3 ml of medium, cultivations at 37° C.) over a period of 1-2 weeks. During this passaging, the plasmid DNA integrated into the yeast genome so that mitotically stable clones were then obtained.

The mitotic stability was tested in the following way:

Three transfers were made from the last passaging culture in YNB glucose medium into complete medium (YPD) and cultivated at 37° C. for 1-2 days. The diluted culture was then plated onto complete medium and onto selective medium. Mitotically stable transformants give approximately the same number of colonies on the two media. It is thus possible to isolate mitotically stable sub-transformants (Z. A. Janowicz et al., Yeast 7 (1991) 431-443).
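The plating comparison can be expressed as a ratio of colony counts. The Python sketch below is a hedged illustration only: the text says that stable transformants give "approximately the same number" of colonies on both media but states no numerical cut-off, so the `min_ratio` default of 0.9, the function names and the example counts are all assumptions.

```python
def stability_ratio(colonies_selective: int, colonies_complete: int) -> float:
    """Fraction of plated cells that still grow on selective medium."""
    if colonies_complete <= 0:
        raise ValueError("need at least one colony on complete medium")
    return colonies_selective / colonies_complete


def is_mitotically_stable(colonies_selective: int,
                          colonies_complete: int,
                          min_ratio: float = 0.9) -> bool:
    """Treat a clone as stable if the two counts are approximately equal.
    The threshold is an assumed value, not taken from the text."""
    return stability_ratio(colonies_selective, colonies_complete) >= min_ratio


# Hypothetical colony counts from equal plating volumes of the same dilution.
print(is_mitotically_stable(196, 200))  # counts nearly equal
print(is_mitotically_stable(40, 200))   # most cells have lost the marker
```

A clone that has lost the plasmid in most cells still grows on complete medium but not on selective medium, so its ratio falls well below one.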

EXAMPLE 4

Expression of foreign gene

For expression studies, the passaged transformants were inoculated into 3 ml of YNB medium containing 1% glycerol or 1% methanol in order to induce MOX or FMD promoters. The cells were cultivated at 37° C. for two days and then spun down, and the culture supernatant was tested for foreign protein (Western blot, ELISA, activity assay).

50 ml of synthetic medium containing 1.5% glycerol in a 500 ml Erlenmeyer flask with baffles were inoculated with efficiently secreting mitotically stable transformants and incubated to an OD₆₀₀ of 10. HPLC analyses of corresponding culture supernatants showed that the hirudin variant is completely correctly processed on use of the sequence GAM 1-72-His-Pro-Leu-Glu as leader-adaptor.

EXAMPLE 5

Fermentation of recombinant yeast strains

The recombinant yeast strains were fermented in synthetic media (double-concentrated YNB medium, 2.8 g/l (Difco), containing 10 g/l ammonium sulfate) which either had been introduced completely at the start of the fermentation or were fed in during the fermentation.

The carbon sources employed were glycerol and methanol or mixtures of glycerol and methanol. The fermentation was started with glycerol as the sole carbon source (≥1% glycerol final concentration in the fermenter during the initial growth phase).

After sterilization of the medium, it was inoculated with 1 l of preculture so that the initial OD₆₀₀ was about 1.

The fermentation took place in two phases: an initial growth phase with a higher glycerol concentration (1%) was followed by a production phase with a lower glycerol concentration (<0.5%) or constant methanol concentration (1%) or a mixture of glycerol and methanol (0.1-0.4% glycerol and 0.2-1.0% methanol).

The carbon source was fed in where appropriate with various control possibilities (continuously or pO₂-coupled).

During the fermentation, ammonium sulfate was added to a final concentration of 5 g/l, thiamine to a final concentration of 0.1 g/l and biotin to a final concentration of 0.3 mg/l.

The pH of the fermentation was kept constant at 4.0 by adding aqueous ammonia; the fermentation temperature was 37° C.

The recombinant yeast strains fermented in this way provided a gene product (hirudin) which was 100% correctly processed.

BRIEF DESCRIPTION OF THE FIGURE

FIG. 1: Nucleic acid sequence of the GAM-leader-adaptor-processing signal-hirudin gene fragment and of the polypeptide sequence encoded thereby (reading frame a). The position of the PCR primers is indicated.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (iii) NUMBER OF SEQUENCES: 5

(2) INFORMATION FOR SEQ ID NO: 1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 54 amino acids
        (B) TYPE: amino acid
        (C) TOPOLOGY: linear
    (ii) MOLECULE TYPE: peptide
    (v) FRAGMENT TYPE: internal
    (vi) ORIGINAL SOURCE: Schwanniomyces occidentalis
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

        Ala Pro Ala Ser Ser Ile Gly Ser Ser Ala Ser Ala Ser Ser Ser Ser
        1               5                   10                  15
        Glu Ser Ser Gln Ala Thr Ile Pro Asn Asp Val Thr Leu Gly Val Lys
                    20                  25                  30
        Gln Ile Pro Asn Ile Phe Asn Asp Ser Ala Val Asp Ala Asn Ala Ala
                35                  40                  45
        Ala Lys His Pro Leu Glu
            50

(2) INFORMATION FOR SEQ ID NO: 2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 33 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

        GGGGGGGAATTCATGATTTTTCTGAAGCTGATT                           33

(2) INFORMATION FOR SEQ ID NO: 3:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 31 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

        GGGGGGCAGCTGCATTAGCATCGACAGCAGA                             31

(2) INFORMATION FOR SEQ ID NO: 4:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 56 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

        GGGGGGCAGCTGCTAAACACCCTCTGGAAAAAAGAGTTGTTTACACTGACTGCACT    56

(2) INFORMATION FOR SEQ ID NO: 5:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 45 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: genomic DNA
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

        GGGGGGGTCGACCCTAGATCTCTATTACTGCAGGTATTCTTCCGG               45

We claim:
1. A process for the recombinant production of proteins which are heterologous in the yeast Hansenula, which comprises transforming Hansenula with an expression cassette which comprises the following structural elements encoded:

    L-A-P-GEN

where L is a leader sequence, A is an adaptor with the sequence SEQ ID NO: 1, P is a processing signal and GEN is a structural gene for the required protein; growing Hansenula in a suitable growth medium; and recovering said protein; wherein said protein is correctly processed.

2. A process as defined in claim 1, wherein the leader sequence of the glucoamylase from Schwanniomyces occidentalis is used as leader sequence L.

3. A process as defined in claim 1, wherein the peptide sequence Lys-Arg is used one or more times as processing signal P.